Messenger API Design Decisions
Understand some of the primary technical considerations that will direct the design of the Messenger API.
Introduction#
The design of the Messenger API involves many services that each play a role in sending, receiving, and storing user messages. In this lesson, we’ll go over Messenger's high-level design and the workflow for handling requests for the different functional requirements. Next, we’ll consider some technical aspects that affect our design decisions, such as API architectural styles, communication protocols, and data formats.
Design overview#
The following illustration shows the components and services involved in the Messenger API. Incoming requests go through three main services:
Asset service: This service handles media and documents sent via chat messages.
User data service: This service manages user profile data and the relevant metadata.
Real-time services: These services are responsible for two-way user communication, such as chat.
Apart from these services, some supporting components, such as Zookeeper and a messaging queue, facilitate their work. The table below lists the responsibilities of each service and component in the Messenger system:
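Components and Services Details
Component or Service | Details |
User data service | Manages user profile data and the relevant metadata; supplies the data used to authenticate and authorize requests. |
Asset service | Handles media and documents sent via chat messages, including upload to and download from persistent storage. |
Messages service | Lets users retrieve (old) messages, for example on a new or alternative device. |
Messaging queue | Takes messages from one chat server and provides them to another, and stores them in persistent storage. |
Zookeeper | Checks the health of the chat, presence, and WebSocket manager servers and keeps a mapping of the live ones. |
WebSocket manager | Maintains the mapping between each user and the chat server (and port) that holds their connection. |
Chat server | Holds the WebSocket connections with clients, assigns IDs to incoming messages, and delivers messages to online users. |
Presence server | Tracks whether users are online or offline. |
API gateway | Entry point for client requests; authenticates and authorizes them and directs them to the backend services. |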
Workflow#
Let's discuss the workflow of the Messenger service in detail. First, the login process is initiated via the API gateway, where the request is authenticated and authorized based on data fetched from the user data service. Next, when a sender intends to send a message to another user, a WebSocket connection is created between the sender and a chat server, and the WebSocket manager records the mapping between the user and the port assigned to them on that chat server. The same process is repeated for the receiver. Meanwhile, Zookeeper checks the health of the chat, presence, and WebSocket manager servers to keep a mapping of the servers that are up and live.
The messaging queue acts as an intermediate component that takes messages from one chat server and provides them to another. It also stores the messages in persistent storage. The messaging queue reduces the burden on chat servers and makes the system reliable by ensuring that messages are still delivered to the chat service after failures. Finally, a user who wants to retrieve (old) messages on a new or alternative device can use the messages service.
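The routing part of this workflow can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the actual Messenger implementation: the class names `WebSocketManager` and `MessagingQueue`, their methods, the server names, and the port numbers are all assumptions made for the example, and a plain list stands in for persistent storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class WebSocketManager:
    """Hypothetical registry: which chat server (and port) holds each user's connection."""
    connections: Dict[str, Tuple[str, int]] = field(default_factory=dict)

    def register(self, user_id: str, server: str, port: int) -> None:
        self.connections[user_id] = (server, port)

    def lookup(self, user_id: str) -> Optional[Tuple[str, int]]:
        return self.connections.get(user_id)


@dataclass
class MessagingQueue:
    """Hypothetical queue: persists every message, then routes it if the receiver is connected."""
    manager: WebSocketManager
    store: List[dict] = field(default_factory=list)  # stands in for persistent storage
    delivered: Dict[str, List[dict]] = field(default_factory=dict)  # chat server -> messages handed over

    def publish(self, message: dict) -> bool:
        self.store.append(message)  # always persist first
        target = self.manager.lookup(message["receiverID"])
        if target is None:
            return False  # receiver has no connection; the message waits in storage
        server, _port = target
        self.delivered.setdefault(server, []).append(message)
        return True


manager = WebSocketManager()
queue = MessagingQueue(manager)
manager.register("userA", "chat-server-01", 5001)

# userB has not connected yet, so the message is only stored.
sent_offline = queue.publish({"senderID": "userA", "receiverID": "userB", "text": "hi"})

# Once userB connects, new messages are routed to their chat server.
manager.register("userB", "chat-server-02", 5002)
sent_online = queue.publish({"senderID": "userA", "receiverID": "userB", "text": "still there?"})
```

Persisting before routing mirrors the statement above that the queue stores messages regardless of whether they can be delivered immediately.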
Points to Ponder
How do we determine whether a user is online?
When a user opens the client application, it establishes a WebSocket connection with a chat server. Hence, a user's online or offline status is determined by checking whether their client has an active WebSocket connection with the chat server. We also store the user's most recent availability timestamp in the database so that we can display their last seen status to their online contacts. Keeping this information in the database makes it durable.
The WebSocket connection remains open when the user is online. However, as the user goes offline, the connection is automatically terminated.
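As a rough sketch of this idea, the snippet below treats a user as online while a connection is open and records a last seen timestamp whenever the connection state changes. The `PresenceTracker` class and its method names are hypothetical, and a dictionary stands in for the database.

```python
import time
from typing import Dict, Optional, Set


class PresenceTracker:
    """Hypothetical tracker: online means an open WebSocket; last seen is recorded on each change."""

    def __init__(self) -> None:
        self.active: Set[str] = set()          # users with an open connection
        self.last_seen: Dict[str, float] = {}  # stands in for the database table

    def on_connect(self, user_id: str, now: Optional[float] = None) -> None:
        self.active.add(user_id)
        self.last_seen[user_id] = time.time() if now is None else now

    def on_disconnect(self, user_id: str, now: Optional[float] = None) -> None:
        self.active.discard(user_id)
        self.last_seen[user_id] = time.time() if now is None else now

    def is_online(self, user_id: str) -> bool:
        return user_id in self.active


tracker = PresenceTracker()
tracker.on_connect("userA", now=1000.0)
online_while_connected = tracker.is_online("userA")
tracker.on_disconnect("userA", now=1060.0)
online_after_disconnect = tracker.is_online("userA")
```

The optional `now` argument exists only to make the example deterministic; a real service would rely on its own clock.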
To share a media file in chat, the sender first uploads the file to persistent storage via the asset service. Next, the URI of the media file is shared in the chat with the intended receiver. The receiver downloads the file via the asset service after the backend validates their access privileges.
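The upload, share, and download steps can be illustrated with a small in-memory stand-in. Everything here is an assumption for the sake of the example: the `AssetService` class, the `asset://` URI scheme, and the rule that only the sender and receiver may download the file.

```python
import uuid
from typing import Dict, Set


class AssetService:
    """Hypothetical in-memory stand-in for the asset service and its persistent storage."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._allowed: Dict[str, Set[str]] = {}

    def upload(self, sender_id: str, receiver_id: str, data: bytes) -> str:
        uri = f"asset://{uuid.uuid4().hex}"  # made-up URI scheme for illustration
        self._blobs[uri] = data
        self._allowed[uri] = {sender_id, receiver_id}
        return uri

    def download(self, user_id: str, uri: str) -> bytes:
        # Validate access privileges before returning the file.
        if user_id not in self._allowed.get(uri, set()):
            raise PermissionError(f"{user_id} may not access {uri}")
        return self._blobs[uri]


assets = AssetService()
photo = b"\x89PNG fake image bytes"
uri = assets.upload("userA", "userB", photo)  # step 1: upload to storage
chat_message = {"senderID": "userA", "receiverID": "userB", "mediaURI": uri}  # step 2: share the URI in chat
received = assets.download("userB", chat_message["mediaURI"])  # step 3: receiver downloads
```

Only the URI travels through the chat channel; the file bytes themselves go through the asset service.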
One-to-one chat#
A chat between two users, User A and User B, is explained below:
A persistent connection is created between each user and their respective chat server, say chat servers 01 and 02. Remember that the WebSocket manager maintains the mapping of users to chat servers.
Chat server 01 receives the message from User A and assigns it an ID.
Chat server 01 forwards the message to the messaging queue, which sends it to chat server 02. Chat server 02 delivers it to User B if they are online.
If User B is offline, the messaging queue stores the message in the database, and it is delivered later when User B comes online.
Note that the messaging queue always stores each message permanently, whether the receiver is online or offline, because users can log in from different devices and retrieve their messages. The process is depicted in the following figure:
Remember: Each user establishes a single connection with a chat server. Each new message sent from the client side carries a senderID, receiverID, messageID, an orderNumber that reflects its position within the session, and the corresponding message text. The chat servers use this information to deliver messages in the right sequence.
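One way a chat server could use the orderNumber is a small reorder buffer that holds back messages that arrive early. The sketch below is an assumption about how this might work, not the lesson's prescribed mechanism: the field names come from the text above, while the `OrderedDelivery` class and the choice to start numbering at 1 are made up for the example. One buffer would be kept per sender session.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ChatMessage:
    senderID: str
    receiverID: str
    messageID: str
    orderNumber: int  # position within the sender's session, assumed to start at 1
    text: str


class OrderedDelivery:
    """Hypothetical reorder buffer: releases messages only in orderNumber sequence."""

    def __init__(self) -> None:
        self._next = 1
        self._pending: Dict[int, ChatMessage] = {}

    def receive(self, message: ChatMessage) -> List[ChatMessage]:
        self._pending[message.orderNumber] = message
        ready: List[ChatMessage] = []
        # Release every consecutive message starting from the next expected number.
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready


buffer = OrderedDelivery()
second = ChatMessage("userA", "userB", "m-2", 2, "How are you?")
first = ChatMessage("userA", "userB", "m-1", 1, "Hi!")
held = buffer.receive(second)     # arrives early, so it is held back
released = buffer.receive(first)  # fills the gap, so both are released in order
```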
Design considerations#
Let's start by deciding which architectural style is preferred to meet our requirements for the Messenger API. Next, we will focus on the protocols that govern the chat between multiple clients and the communication between other backend services. We’ll also discuss the data formats that are required to deliver the data.
Architecture styles#
Client to API gateway: These operations are resource-oriented, such as sign up, sign in, uploading media files via the asset service, or modifying the user's data via the user data service. Because these functions can be performed via CRUD operations, we can use the REST architectural style between the client and the API gateway.
Note: Although we mention that the architectural style between the client and API gateway is REST, it should be noted that the gateway is only used for the initial connection establishment prior to using the chat service via WebSockets.
API gateway to backend services: Since we have a limited number of backend services (the user data and asset services), it is feasible for the API gateway to dynamically direct requests to the respective services. For relatively simple tasks such as ours, GraphQL adds extra complexity and might be detrimental from a maintenance perspective. In contrast, REST is well suited to accessing the backend resources efficiently.
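With only two backend services, the gateway's routing can stay very simple. The following sketch shows prefix-based routing; the paths `/users` and `/assets`, the service names, and the `route` function are hypothetical and only illustrate the idea.

```python
from typing import Dict

# Hypothetical path prefixes mapped to the two backend services.
ROUTES: Dict[str, str] = {
    "/users": "user-data-service",
    "/assets": "asset-service",
}


def route(path: str) -> str:
    """Return the backend service responsible for a REST path."""
    for prefix, service in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return service
    raise LookupError(f"no backend service for {path}")


profile_target = route("/users/42")
upload_target = route("/assets")
```

A table this small is easy to maintain by hand, which is part of why a query layer such as GraphQL would add more complexity than it removes here.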
Protocol selection#
Uploading media: As mentioned in the previous section, we handle chatting and media transfer separately. This section discusses a suitable protocol for transferring media files. Since real-time communication demands that media be delivered swiftly to the receiver, HTTP/2.0 is a suitable option. HTTP/2.0 transmits requests and responses as binary frames, which makes the transfer more compact and efficient to process than the text-based framing of HTTP/1.1. Moreover, its multiplexing feature lets a user send multiple media files concurrently over a single connection.
Live chat: In our design diagram, we mentioned the usage of the WebSocket protocol, but several protocols can be used for live chatting, including AMQP, MQTT, XMPP, and WebRTC. Each of these has its advantages and disadvantages. For our use case, WebSocket is the best fit: it allows two-way, persistent, lightweight, and stateful communication. After the initial handshake, messages avoid the per-request overhead of HTTP, and most modern browsers provide built-in APIs for it. Therefore, we’ll use WebSocket for real-time communication in our API.
Note: We can deduce from the discussion above that we will simultaneously use the following two connections:
WebSocket: For live chat
HTTP/2.0: For sending media files
Data formats#
For uploading and retrieving media files, we opt for the binary format, which is efficient and allows higher compression, reducing bandwidth requirements. For operations like updating or retrieving user data, we use the JSON format, which is lightweight and human-readable and is also well suited to delivering text messages.
For real-time communication, we use the WebSocket protocol, which carries either binary or UTF-8 encoded text data. Most modern web browsers provide a WebSocket API that lets us send a JSON-encoded payload as text. Since we are designing Messenger and users mainly send text messages, we have chosen UTF-8 encoded JSON, which is human-readable and supported by WebSocket. The API provided by the browser or application serializes this data into text (or binary) frames for transfer over the WebSocket connection.
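To make the encoding step concrete, the snippet below serializes a chat message to UTF-8 encoded JSON and decodes it again using Python's standard `json` module. The field values are invented for illustration, and the actual WebSocket send and receive calls are left out.

```python
import json

# Example payload using the fields described earlier; the values are made up.
message = {
    "senderID": "userA",
    "receiverID": "userB",
    "messageID": "m-1",
    "orderNumber": 1,
    "text": "Héllo, how are you?",
}

# Sender side: JSON text, then UTF-8 bytes ready to go into a WebSocket frame.
frame = json.dumps(message, ensure_ascii=False).encode("utf-8")

# Receiver side: UTF-8 bytes back to JSON text, then to a dictionary.
decoded = json.loads(frame.decode("utf-8"))
```

Passing `ensure_ascii=False` keeps non-ASCII characters as real UTF-8 instead of `\u` escapes, which keeps the payload readable when inspected.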
Points to Ponder
Why don’t we employ a single WebSocket connection for both live chatting and transferring media files?
Using a single WebSocket connection can result in a degraded user experience. For instance, if we want to transfer a large file, it might block the user from exchanging new messages.
Summary#
Let’s summarize our decisions in the following table:
Design Considerations | Client to API Gateway | API Gateway to Backend Services | Between Client and Chat Server |
API architecture style | REST | REST | Event-driven |
Protocols | HTTP/2.0 | HTTP/2.0 | WebSocket |
Data formats | JSON, Binary | JSON, Binary | UTF-8 encoded JSON |